The accuracy of an electronic nose to diagnose tuberculosis in patients referred to an expert centre

Introduction: An electronic nose (eNose) device has previously shown high sensitivity and specificity to diagnose or rule out tuberculosis (TB). The aim of this study was to evaluate its performance in patients referred to INERAM. Methods: Patients aged ≥15 years were included. A history, physical examination, chest radiography (CXR) and microbiological evaluation of a sputum sample were performed in all participants, as well as a 5-minute breath test with the eNose. TB diagnosis was preferably established by the gold standard and compared to the eNose predictions. Univariate and multivariate logistic regression analyses were performed to assess potential risk factors for erroneous classification results by the eNose. Results: 107 participants with signs and symptoms of TB were enrolled, of whom 91 (85.0%) were diagnosed with TB. The blind eNose predictions resulted in an accuracy of 50%, a sensitivity of 52.3% (95% CI 39.6–64.7%) and a specificity of 36.4% (95% CI 12.4–68.4%). Risk factors for erroneous classifications by the eNose were older age (multivariate analysis: OR 1.55, 95% CI 1.10–2.18, p = 0.012) and antibiotic use (multivariate analysis: OR 3.19, 95% CI 1.06–9.66, p = 0.040). Conclusion: In this study, the accuracy of the eNose to diagnose TB in a tertiary referral hospital was only 50%. Antibiotic use and older age are important factors that negatively influence the diagnostic accuracy of the eNose. Therefore, its use should probably be restricted to screening in high-risk communities in less complex healthcare settings.


Introduction
Although tuberculosis (TB) may seem a 'silent pandemic' compared to COVID-19, it is responsible for 10 million cases and 1.2 million notified deaths each year [1,2]. TB elimination is complex, partly because of the many steps in the 'cascade of care' needed to establish the diagnosis, and partly because of characteristics of the bacterium itself, such as slow replication and drug resistance, which complicate treatment [1].
The "holy grail" of TB diagnostics is a portable and rapid 'point-of-care' test that provides a result within minutes [3]. Such a rapid test will never replace the current gold standard (positive culture of Mycobacterium tuberculosis (MTB) complex) but may be used as a screening tool to either rule out TB or detect individuals with active disease. Conventional diagnostic methods, such as chest radiography (CXR) and sputum-based tests such as Ziehl-Neelsen (ZN) staining, GeneXpert MTB/RIF and culture of MTB, are expensive, time-consuming, labour-intensive or not widely available [1].
An electronic nose (eNose) is a small, portable and potentially rapid diagnostic 'point-of-care' device that detects gases in human breath and passes vectors derived from volatile organic compounds (VOCs) in exhaled breath samples to a machine learning algorithm to diagnose a patient with TB [4]. Analysers for exhaled breath can be separated into two groups: (1) Systems for detecting and measuring a set of specific VOCs. Usually, these systems consist of a mobile unit for collecting the gas sample and another (fixed) unit for analysing it and separating the individual components. Separation techniques typically include gas chromatography-mass spectrometry (GC-MS) or ion mobility spectrometry (IMS). (2) Electronic noses, often portable, possessing multiple different non-specific sensors (e.g. metal-oxide or conducting polymer) for measuring an integrated breath profile. In this case, machine learning techniques are applied to separate the breath profiles of sick and healthy individuals. To accomplish this, a classifier (e.g. artificial neural network, random forest, support vector machine) has to be trained accordingly [5]. Several studies have been performed with the eNose so far, and these showed promising results [6][7][8][9]. In Paraguay, the eNose has been tested in adults (>18 years) with pulmonary TB, asthma or COPD and in healthy controls, showing a high sensitivity (91%) and specificity (93%) to diagnose TB [10]. The eNose was also tested in an indigenous population, demonstrating its usefulness as a rule-out TB test in a remote area [11]. Additional studies testing the eNose in different settings are needed.
The aim of this study was to assess the accuracy of the eNose in classifying individuals with signs and symptoms compatible with TB (in- and outpatient clinic) referred to the Paraguayan National Reference Center for respiratory diseases and TB (INERAM).

Study design and setting
A prospective study was performed in hospitalized and ambulatory patients from January 2016 to December 2017.

Participants
Participants aged ≥15 years who presented with respiratory symptoms for more than 15 days, or who had already started anti-TB treatment (<3 days before inclusion), were included.
Authorization by parents and/or guardians was provided for participants under 18 years old. In Human Immunodeficiency Virus (HIV) positive individuals, the duration of respiratory symptoms was not considered. Exclusion criteria were respiratory failure and being unwilling or unable to sign informed consent.

Data collection
Patient demographics and clinical data (age, gender, BMI), information on co-morbidities (HIV, diabetes mellitus, asthma, chronic obstructive pulmonary disease (COPD), or arterial hypertension), smoking and alcohol habits, drug abuse, co-medication (antibiotic use and/or TB treatment), food and beverage intake, and dental status (good condition: complete parts, no cavities or seals; regular: complete parts, presence of seals; bad: incomplete parts, presence of caries, incomplete seals) were recorded. Physical examination, a CXR, microbiological examination of sputum or biopsies and breath sampling with the eNose were performed in all participants. Microbiological examinations and breath tests were conducted on the same day. TB symptoms (cough, fever, dyspnoea, night sweats, lymphadenopathy and/or haemoptysis) were noted. The study physicians assessed the CXRs for cavities, infiltrates, atelectasis and pleural effusion. All participants provided at least one sputum sample (produced spontaneously or induced by nebulization of 4 mL of hypertonic saline). ZN staining and mycobacterial culture (solid Ogawa-Kudoh medium) were performed in the laboratory of INERAM. In some cases, GeneXpert was performed according to the national guideline. The GeneXpert MTB/RIF® (Cepheid) samples were processed by the National Reference Laboratory (LCSP). The sputum samples were also evaluated for mycosis and common pathogens. Pulmonary TB (PTB) diagnosis was established by the gold standard (positive culture of MTB complex). In case of a negative culture result, the diagnosis was based on other strong supporting evidence (e.g. ZN positive, and/or MTB detected by GeneXpert) [12]. If no bacteriological proof was found, the diagnosis was called a 'clinical diagnosis'. In this last category, the improvement of clinical symptoms with anti-TB treatment was evaluated after two months (the initial phase of treatment).
Extra-pulmonary TB (EPTB) patients were diagnosed by microbiological confirmation (positive culture of MTB) or with histopathological or chemical evidence supporting TB (in biopsies, pus or pleural fluid), or showing improvement of clinical condition after two months of anti-TB treatment. All participants were followed at the outpatient clinic. TB patients received anti-TB treatment according to the national guideline [13].

Breath sampling with the eNose
All participants underwent five-minute breath sampling, using a nose clamp, through the eNose (Aeonose™ device, Zutphen, The Netherlands). The breath test measurements were done in the morning, always in the same place, in a room free of dust and of odours from gases and alcohol. The outside of the eNose device was not cleaned unless this was needed because of stains left by a previous user. After every breath sampling, a clean burn (approximately 10 minutes) was performed inside the device by heating the metal sensors to 280 degrees Celsius, as part of the complete cycle process. Without this clean burn, the eNose device cannot be used for another participant. The time of the last meal, beverage intake, medication or cigarette was recorded. All participants used a nose clamp and provided a breath sample by inhaling and exhaling for 5 min through the Aeonose™. The same device was used throughout the whole study. During the sampling process, 36 measurement cycles, each containing 64 data points, were recorded per sensor. As a result, each patient's measurement comprised a data matrix with thousands of records. Proper reproducibility of the results is possible due to the sensors' temperature control. However, even for sensors produced on the same wafer, thickness and ageing differences can cause small variations between sensors and Aeonose™ devices over time. This phenomenon is dealt with by normalizing the data. The data are then compressed using a Tucker3-like algorithm. This results in a vector of 10 components per patient, from which redundant information and noise have been removed. These resulting vectors and the results of the classification are used to train an artificial neural network (ANN). A double cross-validation was applied using the Leave-10%-Out method to minimize the risk of systematic errors [10].
The analysis of the breath samples was done by The eNose Company. The company did not take part in study design and logistics. To establish a new 'training data set' to train and optimize the previously used neural network (NN) from our calibration study [10], 31 de-blinded participants from this new cohort were added to the former training data set. For a full description of the training and internal validation of the NN, see previously published research [6].
The data of the remaining study participants (n = 76) were kept blinded from The eNose Company. The participants in this 'blind data set' were analysed in the optimized NN and then classified by The eNose Company as "TB yes" or "TB no".
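The Leave-10%-Out idea mentioned above can be illustrated with a minimal sketch. It shows a single level of splitting only: each sample is held out exactly once, in a test fold of roughly 10% of the data. It does not reproduce the double cross-validation or the neural network used in the study. The function name `leave_10_percent_out`, the seed and the sample count of 50 are hypothetical choices for illustration, not taken from the study.

```python
import random


def leave_10_percent_out(n_samples, seed=0):
    """Yield (train_idx, test_idx) pairs; each test fold holds about 10% of the samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is reproducible
    n_folds = 10
    for k in range(n_folds):
        test_idx = indices[k::n_folds]  # every 10th shuffled index, starting at offset k
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


# Hypothetical example: 50 compressed breath vectors (not the study's sample size).
folds = list(leave_10_percent_out(50))
for train_idx, test_idx in folds:
    # A classifier would be fitted on train_idx and evaluated on test_idx here.
    pass
```

Because every sample is used for testing exactly once, the pooled test predictions give an error estimate that does not depend on one particular train/test split.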

Statistical analysis
Statistical analysis was performed using SPSS version 25.0 software (IBM Corporation, Armonk, NY, USA). Continuous variables were expressed as the mean (with standard deviation) or median (with range), and categorical variables were expressed as frequency (with percentage). Univariate testing of continuous variables was performed using Student's t-test in case of a normal distribution, and non-parametric tests in case of a non-normal distribution. Categorical variables were analysed using χ² testing or Fisher's exact test. Binary logistic regression analysis was performed to identify possible risk factors for wrong predictions by the NN algorithm. Variables with a p value ≤0.10 in univariate analysis were selected for multivariate logistic regression analysis. To prevent overfitting, the number of variables in the multivariate analysis was restricted by the number of patients with the outcome. All tests were two-sided, and a p value of ≤0.05 was considered statistically significant.
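For readers less familiar with logistic regression output, the following minimal sketch shows how an odds ratio (OR) and its 95% confidence interval relate to a regression coefficient and its standard error. It assumes a Wald-type interval, exp(β ± 1.96·SE); the source does not state which interval type was used, so this is an assumption. The helper names are hypothetical. As a worked example, the standard error is back-calculated from the antibiotic-use result reported in the abstract (OR 3.19, 95% CI 1.06–9.66); it is not a value reported by the authors.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a log-odds coefficient, assuming a Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Back out the standard error on the log-odds scale from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported for antibiotic use: OR 3.19, 95% CI 1.06-9.66.
beta = math.log(3.19)        # coefficient implied by the reported OR
se = se_from_ci(1.06, 9.66)  # back-calculated, not reported in the source
or_value, ci_low, ci_high = odds_ratio_ci(beta, se)
print(f"OR {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

The reconstructed bounds agree with the reported interval up to rounding, which is what one would expect if a Wald interval was used.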

Inclusivity in global research
Additional information regarding the ethical, cultural, and scientific considerations specific to inclusivity in global research is included in the S1 Questionnaire.

Results
During the study period, 107 participants were enrolled, with a mean age of 37 years; 70% were male (Table 1). TB was diagnosed in 91/107 (85%) of the participants. The most common symptom was cough (87.9%), followed by fever (77.6%), dyspnoea (73.8%) and night sweats (72.0%). Four HIV-positive TB patients were included, all of whom were on antiretroviral therapy. Fifty-six participants (52.3%) had started TB treatment ≤3 days before they were included in the study and provided the breath sample. Two participants had received TB treatment in the past (>3 years ago). Table 2 shows the microbiological outcomes of all participants. TB diagnosis was established by the gold standard in 67 cases (74%). The remaining 26% (24/91) were diagnosed differently: five PTB patients had ZN-positive sputum only and another five had a 'clinical diagnosis'. Fourteen patients with EPTB (pleural or lymph node) also lacked bacteriological confirmation. The remaining group of 16/107 (15.0%) participants had an alternative diagnosis (e.g. pneumonia or chronic obstructive pulmonary disease (COPD) exacerbation).
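The blind-set performance figures given in the abstract (accuracy 50%, sensitivity 52.3%, specificity 36.4%) follow from a 2×2 table of eNose predictions against the reference diagnosis. The sketch below shows that calculation. The cell counts are an assumption: they were back-calculated so that they match the reported percentages and the blind-set size of 76, and they are not stated in the source. The Wilson score interval is included as one common choice of confidence interval; the source does not say which method the authors used, so its output need not match the reported intervals.

```python
import math


def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (one common choice, assumed here)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Assumed counts, reconstructed from the reported percentages (n = 76 blind samples).
tp, fn, tn, fp = 34, 31, 4, 7
metrics = diagnostic_metrics(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With only 11 participants without TB under these assumed counts, any interval around the specificity is necessarily wide, which is consistent with the broad interval reported in the abstract.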

Discussion
We evaluated the performance of the eNose in a referral hospital to diagnose TB in ill patients presenting with respiratory symptoms. In this study, the accuracy of the eNose device was only 50%. In this setting, the eNose did not perform well enough to meet the requirements for new screening tools or tests [3]. Older age (OR 1.55) and antibiotic use before TB diagnosis (OR 3.19) were significant risk factors for an incorrect prediction by the eNose. VOCs are unique to an individual's metabolism and may provide valuable information about the health condition of a subject. However, both host factors (genetics, co-morbidity, type of pathogen) and external factors (medication, toxic habits, food intake) may alter breath prints, which could result in false-negative (FN) or false-positive (FP) classifications by the NN [14][15][16]. The use of antibiotics in a person with signs and symptoms compatible with TB is known to negatively influence the accuracy of conventional methods to diagnose TB, resulting in diagnostic delays in daily practice [17][18][19][20][21][22]. In our study, the use of antibiotics also significantly reduced the accuracy of the eNose device. Even though several studies show that co-medication influences breath prints, to the best of our knowledge there are no publications that specifically describe the influence of antibiotics on an individual's VOCs.
We found that older age of the patients was also a risk factor for incorrect prediction by the eNose. Several studies have shown an association between age and breath prints [23][24][25][26]. The precise impact of ageing on metabolism is a topic that is widely discussed in the literature. It is not unreasonable to expect that this factor influences the concentrations of some metabolites and thereby alters breath prints [16].
In contrast to antibiotic use, anti-TB medication (<3 days before inclusion of the participant) did not have a statistically significant impact on the predictive value of the eNose in this study. A possible explanation is that active TB develops slowly over the course of weeks to months and requires a long duration of treatment to clear the bacteria from the patient. It is therefore understandable that a patient's metabolism does not change rapidly within a few days, and that the recent start of anti-TB treatment did not negatively influence the eNose predictions in our study cohort. We also did not find any association between BMI and incorrect predictions, even though there is evidence that patients with a high BMI have different breath prints and a higher risk of false-positive test results compared to patients with a normal or low BMI [10,23,27,28]. Dental status, time since last meal or drink, smoking and drug use also did not appear to be risk factors for a wrong prediction by the eNose in this study.
Our study has some limitations. Firstly, the sample size was small and included only a small group of patients with an alternative diagnosis. We might therefore have introduced a design bias: the new NN 'training data set' might not have contained enough pneumonia patients to train the new diagnostic algorithm correctly, as the previous cohort consisted of PTB patients, patients with obstructive airway disease and healthy controls. Analysing larger cohorts may increase the accuracy of the eNose by establishing a more robust neural network algorithm. Secondly, the gold standard to establish TB diagnosis was not achieved in all patients (mainly patients with EPTB), and it is possible that some of these patients in fact did not have active TB and that the eNose classified them correctly.

Conclusion
In this study, the accuracy of the eNose to diagnose TB in a tertiary referral hospital was only 50%. Factors associated with wrong predictions by the eNose were antibiotic use and older age of the participants. For complex referral patients, most of whom have used antibiotics to rule out a simple pneumonia first, as recommended by the WHO, the eNose did not prove useful. The eNose is likely to be more suitable as a TB rule-out test in less complex healthcare settings.